Model-based Clustering with Noise: Bayesian Inference and Estimation

Authors

  • Halima Bensmail
  • Jacqueline J. Meulman
Abstract

Bensmail, Celeux, Raftery and Robert (1997) introduced a fully Bayesian approach to cluster analysis based on geometric modeling of the within-group covariance in a mixture of multivariate normal distributions. This is a model-based methodology in which the structure of the covariance matrix plays a central role. Similar structures had previously been used, in a maximum likelihood framework, by Banfield and Raftery (1993) for clustering data, with some parameters of the covariance structure restricted to be known. In the same framework, Dasgupta and Raftery (1998) used the same reparameterization to detect features in a spatial point process using a maximum likelihood approach. These approaches work well but have limitations: not all covariance structures were considered, and some parameters of the covariance structures were fixed. This paper proposes a way of overcoming these limitations. It generalizes the model used in the previous approaches by introducing a more comprehensive portfolio of covariance matrix structures. Further, it proposes a Bayesian solution for clustering problems in the presence of noise. The performance of the proposed method is first studied by simulation; the procedure is then applied to data on butterfly species and diabetes patients.
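As an illustration of the covariance reparameterization mentioned in the abstract, the following minimal sketch builds a two-dimensional covariance matrix in the volume-shape-orientation form of Banfield and Raftery (1993), Sigma = lambda * D * A * D^T, and evaluates a normal mixture density with an added noise component. The function names are hypothetical, and modeling the noise as uniform over a region of known area is an assumption made here for illustration; the abstract does not specify the noise model, and the sketch does not implement the paper's Bayesian estimation procedure.

```python
import math

def covariance_2d(volume, shape, angle):
    """Build a 2x2 covariance as volume * D * A * D^T.

    A = diag(shape, 1/shape) has determinant 1 (the shape),
    D is a rotation by `angle` radians (the orientation),
    and `volume` is the scalar lambda, so det(Sigma) = volume**2.
    """
    c, s = math.cos(angle), math.sin(angle)
    a1, a2 = shape, 1.0 / shape
    s11 = volume * (a1 * c * c + a2 * s * s)
    s12 = volume * (a1 - a2) * c * s
    s22 = volume * (a1 * s * s + a2 * c * c)
    return ((s11, s12), (s12, s22))

def normal_pdf_2d(x, mean, cov):
    """Bivariate normal density at point x."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mean)^T cov^{-1} (x - mean), using the 2x2 inverse.
    q = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def mixture_with_noise_pdf(x, noise_weight, region_area, components):
    """Mixture density at x, assumed to lie inside the observation region.

    `components` is a list of (weight, mean, cov) tuples. The noise
    component is ASSUMED uniform over a region of area `region_area`;
    noise_weight plus the component weights should sum to 1.
    """
    density = noise_weight / region_area
    for weight, mean, cov in components:
        density += weight * normal_pdf_2d(x, mean, cov)
    return density
```

Fixing or freeing each of the three factors (volume, shape, orientation) across clusters gives the different covariance structures that the abstract refers to as a portfolio.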




Journal:
  • J. Classification

Volume 20  Issue 

Pages  -

Publication date 2003